Text File | 2000-05-25 | 62.8 KB | 1,401 lines |
- Mammon_'s Tales to Fravia's Grandson
- ...An IDA Primer...
-
- Contents
- --------
- *Introduction
- *Configuring IDA
- *Loading a program
- *Viewing Imports
- *Viewing Exports
- *Viewing Strings/Resources
- *Searching for Strings/Code
- *Commenting Code
- *Working with IDC scripts
- *Producing an Output File
- *Advanced Techniques
-
- Introduction
- ------------
- Ok, this is a long document for "the basics", mostly due to the Configuration section. New users may
- want to skip that section, or simply apply the changes suggested therein without reading the explanations.
- Also, some parts of "Advanced Techniques" may get lengthy as well.
-
- Why is IDA so useful? Because it can do anything. IDA will change the way you think about disassemblers; it
- will change the way you think about cracking. W32Dasm? A toy. Soft-Ice? Unnecessary. When you have a disassembler
- that lets you follow the flow of execution by tapping the keyboard, backtrace just as easily, name variables/
- addresses/functions, view the entire program as opcodes or assembly, change code to data and back again according
- to your whim, and even run limited C programs to perform operations on the code from searching and parsing to
- translating and patching...why go somewhere else?
-
- IDA is a reverse engineer's tool. Like many such tools, it is incredibly useful for crackers...yet it is not
- designed for them. It is huge, it is complex, it requires a lot of studying and tuning to get it to perform.
- What follows is an attempt to demonstrate how to get the most out of IDA when getting it "straight out of the
- box": configuration changes are suggested, macros are provided, and a basic tour of using the program in the
- manner of W32Dasm is attempted as well. By the end of this document you should know IDA's capabilities and
- potential well; you should also know how to track down API calls, string references, and specific opcodes.
-
- As a tool for engineers, IDA requires that you know what you are doing. The more you know, the more you will get
- out of it. At the very least I would recommend reading the PE file format reference at
- http://www.microsoft.com/win32dev/base/pefile.htm
- Cristina Cifuentes' doctoral thesis (selectively, of course) at
- http://www.cs.uq.edu.au/groups/csm/dcc.html#thesis
- and of course the IDA home page itself at
- http://www.unibest.ru/~ig/index.html
- ...That should make you familiar enough with disassembling and the PE file format to use
- IDA to its greatest potential.
-
- What are all these IDA files? Yes, IDA is huge, and some of the files may be useless to you. Here is a quick
- overview:
- *.CFG -- IDA Configuration Settings
- IDA.KEY -- Registration File
- IDA2.EXE -- OS/2 Executable
- IDAX.EXE -- DOS4/GW Executable
- IDAW.EXE -- Win32 Executable
- IDA.INT -- Auto-generated comments
- *.LDO -- File loader for OS/2 Executable (ex PE.LDO = PE File Loader)
- *.LDX -- File loader for DOS4/GW Executable
- *.LDW -- File loader for Windows Executable
- *.DLL -- Disassembler for OS/2 Executable (ex PC.DLL = PC Disassembler)
- *.D32 -- Disassembler for DOS4/GW Executable
- *.W32 -- Disassembler for Windows Executable
- /IDC -- IDC macro scripts and include files
- /IDS -- IDS files for commenting/naming imports
- /Sig -- FLIRT/Compiler signature files (for recognizing target's compiler)
-
-
- Configuring IDA
- ---------------
- In the \IDA37? directory, locate the file Ida.cfg and open it in any text editor.
- The file is divided into two main sections, First Pass and Second Pass, each of
- which has different configuration options: the first pass contains the file
- extension to processor type associations, the memory and screen configuration,
- OS/2 options, and hotkey definitions; the second pass contains general program
- parameters, code analysis configuration, format options for the code displayed,
- ASCII string display options, displayable characters, macro definitions, and
- processor options.
-
- The areas of the configuration file that you will most likely want to change are:
- *Screen Configuration
- *Format Options (Text Representation)
- *ASCII Display Options
- *Processor Options
-
- Some additional areas that you may want to configure are:
- *Hotkey Definitions
- *Code Analysis Options
- *Displayable Characters
-
- 1. Screen Configuration
- Out of the box, the IDA screen configuration section looks like this:
- ====================================================================
- // Screen configuration (first pass)
- // ---------------------------------
- #ifdef __MSDOS__
- SCREEN_MODE = 0 // Screen mode to use
- // 0 - don't change screen mode
- // DOS: AL for INT 10
- #else
- SCREEN_MODE = 0 // Screen mode to use
- // high byte - cols, low byte - rows
- // i.e. 0x5020 is 80cols, 32rows
- #endif
- SCREEN_PALETTE = 0 // Screen palette:
- // 0 - automatic
- // 1 - B & W
- // 2 - Monochrone
- // 3 - Color
- ====================================================================
- The MS-DOS SCREEN_MODE and the SCREEN_PALETTE need not change. If you are using
- Windows, the second ("#else") SCREEN_MODE will determine your screen size. Note that
- the col/row numbers are in hexadecimal, thus 0x5020 is 80x32 in decimal. I have found
- that 0x5530 (85x48) works best on an 800x600 resolution screen.
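
The high-byte/low-byte packing is easy to get wrong by hand, so here is a small Python sketch (used purely as a calculator; the function names are my own and have nothing to do with IDA) that converts between a SCREEN_MODE value and a columns/rows pair:

```python
def decode_screen_mode(mode):
    """Split a SCREEN_MODE word into (columns, rows): high byte, low byte."""
    return (mode >> 8) & 0xFF, mode & 0xFF

def encode_screen_mode(cols, rows):
    """Pack columns into the high byte and rows into the low byte."""
    return (cols << 8) | rows

print(decode_screen_mode(0x5020))        # (80, 32)
print(decode_screen_mode(0x5530))        # (85, 48)
print(hex(encode_screen_mode(80, 50)))   # 0x5032
```

Pick the columns and rows you want, run them through encode_screen_mode, and paste the result into the "#else" SCREEN_MODE line.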
-
- 2. Text Representation
- Initially, the Text Representation section is given as follows:
- ====================================================================
- // Text representation
- //-------------------------------------------------------------------------
- OPCODE_BYTES = 0 // don't display bytes of instruction/data
- INDENTION = 16 // Indention of instructions
- COMMENTS_INDENTION = 40 // Indention for on-line comments
- MAX_TAIL = 16 // Tail depth
- MAX_XREF_LENGTH = 80 // Maximal length of line with cross-references
- MAX_DATALINE_LENGTH = 70 // Data directives (db,dw, etc):
- // max length of argument string
- SHOW_AUTOCOMMENTS = NO // Don't show silly comments
- SHOW_BAD_INSTRUCTIONS = NO // Don't bother about instruction lengthes
- SHOW_BORDERS = YES // Borders between data/code
- SHOW_EMPTYLINES = YES // Generate empty line to make
- // text more readable
- SHOW_LINEPREFIXES = YES // Show line prefixes (1000:0000)
- SHOW_SEGMENTS = YES // Show segments in addresses
- USE_SEGMENT_NAMES = YES // Show segment names instead of numbers
- SHOW_REPEATABLE_COMMENTS = YES // Of course, use repeatable comments
- // Disabling this increases IDA speed.
- SHOW_VOIDS = NO // Don't display <void> marks
- SHOW_XREFS = 2 // Show 2 cross-references
- SHOW_XREF_VALUES = YES // If not, xrefs are displayed
- // as "..."
- SHOW_SEGXREFS = YES // Show segment part of addresses
- // in cross-references
- SHOW_SOURCE_LINNUM = YES // Show source line numbers
- // (used in .obj files and java)
- SHOW_ASSUMES = YES // Generate 'assume' directives
- SHOW_ORIGINS = YES // Generate 'org' directives
- USE_TABULATION = YES // Use '\t' in output file
- ====================================================================
- Of course this section is modified to suit taste, and can be configured through the
- Options-Text Representation menu item (though changes made within IDA are saved only
- for the current project). I usually use the following changes:
- ====================================================================
- OPCODE_BYTES = 6 // I want the hex codes!
- INDENTION = 0 // Save some space
- COMMENTS_INDENTION = 30 // Save some space
- MAX_DATALINE_LENGTH = 100 // These can get long
- SHOW_BAD_INSTRUCTIONS = YES // bother about instruction lengthes
- SHOW_BORDERS = NO // why border?
- SHOW_EMPTYLINES = NO // These lines waste space
- SHOW_XREFS = 15 // Show a ton of cross-references
- SHOW_ORIGINS = NO // Hide 'org' directives
- ====================================================================
-
- 3. ASCII Strings & Names
- Here are the default settings that come with IDA:
- ====================================================================
- // ASCII strings & names
- //-------------------------------------------------------------------------
- ASCII_GENNAMES = YES // Generate names when making
- // an ASCII string
- ASCII_TYPE_AUTO = YES // Should IDA mark generated ascii names
- // as 'autogenerated'?
- // Autogenerated names will be deleted
- // when the ascii string is deleted
- // Also, they are displayed with the
- // same color as dummy names.
- ASCII_LINEBREAK = '\n' // This char forces IDA
- // to start a new line
- ASCII_PREFIX = "a" // This prefix is used when a new
- // name is generated
- #define ASCII_STYLE_C 0x00000000// Character-terminated ASCII string
- #define ASCII_STYLE_PASCAL 0x00000001// Pascal-style ASCII string (length byte)
- #define ASCII_STYLE_LEN2 0x00000002// Pascal-style, length has 2 bytes
- #define ASCII_STYLE_UNICODE 0x00000003// Unicode string
- ASCII_STYLE = ASCII_STYLE_C // Default is C-style
- ASCII_SERIAL = NO // Serial names are disabled
- ASCII_SERNUM = 0 // Number to start serial names
- ASCII_ZEROES = 0 // Number of leading zeroes in
- // serial names
- // type of generated names: (dummy names)
- #define NM_REL_OFF 0
- #define NM_PTR_OFF 1
- #define NM_NAM_OFF 2
- #define NM_REL_EA 3
- #define NM_PTR_EA 4
- #define NM_NAM_EA 5
- #define NM_EA 6
- #define NM_EA4 7
- #define NM_EA8 8
- #define NM_SHORT 9
- #define NM_SERIAL 10
- DUMMY_NAMES_TYPE = NM_REL_OFF
- MAX_NAMES_LENGTH = 15 // Maximal length of new names
- // (you may specify values up to 120)
- // Types of names that should be included into the list of names
- // (this list usually appears by pressing Ctrl-L)
- // normal 1
- // public 2
- // auto 4
- // weak 8
- LIST_NAMES = 0x07 // default: include normal, public, weak
- ...and a ton of demangling info...
- ====================================================================
- What's the big deal? It's only strings... Well, to tell the truth, a string is just
- a collection of bytes virtually indistinguishable--to the untrained eye--from opcode
- bytes. IDA will pick up a lot of strings, but it has to have a default string type...
- hence the ASCII_STYLE definition. This defaults to ASCII_STYLE_C, but you may want to
- change it to ASCII_STYLE_UNICODE if you will be dealing primarily with Windows 95/NT
- programs. [Note: You can change string types dynamically in IDA using the Options->ASCII
- Strings Style menu item, in case your target has multiple string types...notice also that
- from within IDA you can define different "end characters" from 1 to 2 bytes...this is very
- handy for special "internal" data types that some targets use.]
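
To make the four ASCII_STYLE_* types concrete, here is a rough Python sketch of how the same text would be stored under each one. This is my own illustration, not IDA code, and it rests on a few assumptions: the C-style terminator is a zero byte, the two-byte length is little-endian, and a Unicode string is 16-bit little-endian characters ending in a 16-bit zero.

```python
import struct

def read_string(buf, style):
    """Return the text of a string stored at the start of buf in the given style."""
    if style == "C":          # ASCII_STYLE_C: bytes up to a terminator (0 assumed)
        return buf[:buf.index(b"\x00")].decode("ascii")
    if style == "PASCAL":     # ASCII_STYLE_PASCAL: one length byte, then the text
        return buf[1:1 + buf[0]].decode("ascii")
    if style == "LEN2":       # ASCII_STYLE_LEN2: two length bytes (little-endian assumed)
        (length,) = struct.unpack_from("<H", buf)
        return buf[2:2 + length].decode("ascii")
    if style == "UNICODE":    # ASCII_STYLE_UNICODE: 16-bit chars, 0x0000 terminator assumed
        end = 0
        while end + 1 < len(buf) and buf[end:end + 2] != b"\x00\x00":
            end += 2
        return buf[:end].decode("utf-16-le")
    raise ValueError("unknown style: %s" % style)

print(read_string(b"Notepad\x00junk", "C"))                              # Notepad
print(read_string(b"\x07Notepad", "PASCAL"))                             # Notepad
print(read_string(b"\x07\x00Notepad", "LEN2"))                           # Notepad
print(read_string("Notepad".encode("utf-16-le") + b"\x00\x00", "UNICODE"))  # Notepad
```

The point is simply that the same bytes read very differently depending on the style chosen, which is why a wrong ASCII_STYLE default leaves you with truncated or missing strings.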
- Now, what about those weird name types? Here they are, translated:
- // normal 1: this shows internal functions, etc
- // public 2: this includes exports, entry points
- // auto 4: this shows the irritating IDA names
- // weak 8: this is useless ;)
- #define NM_REL_OFF 0 = loc_0_1234 segbase relative to prog base & offset from segbase
- #define NM_PTR_OFF 1 = loc_1000_1234 segment base address & offset from the segment base
- #define NM_NAM_OFF 2 = loc_dseg_1234 (*) segment name & offset from the segment base
- #define NM_REL_EA 3 = loc_0_11234 segment relative to base address & full address
- #define NM_PTR_EA 4 = loc_1000_11234 segment base address & full address
- #define NM_NAM_EA 5 = loc_dseg_11234 segment name & full address
- #define NM_EA 6 = loc_12 full address (no leading zeroes)
- #define NM_EA4 7 = loc_0012 full address (at least 4 digits)
- #define NM_EA8 8 = loc_00000012 full address (at least 8 digits)
- #define NM_SHORT 9 = dseg_1234 the same as (*) without data type specifier
- #define NM_SERIAL 10= loc_1 enumerated names (1,2,3...)
-
- The first part determines what names are shown in the "Names" window; in general, the fewer the better.
- If you want the Names to show only the exports of the program, choose 0x02. The next section determines
- how internal addresses are referred to in the disassembled listing; if you like Sourcer's method
- of defining "location1, location2, etc" you should try defaulting to NM_SERIAL; if you like the location
- to show just the segment name and offset, use NM_SHORT. You can experiment with this using the Options->
- Name Representation menu item in IDA.
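
If you want to sanity-check a LIST_NAMES mask or preview a dummy-name style before touching ida.cfg, a few lines of Python will do. The helper names below are my own (nothing IDA ships), the formats simply mirror the table above, and only some of the styles are covered. Going by the bit values listed, the shipped 0x07 works out to normal + public + auto.

```python
# Bit values for LIST_NAMES, as listed in the ida.cfg comments.
NAME_TYPES = {1: "normal", 2: "public", 4: "auto", 8: "weak"}

def list_names(mask):
    """Return the name types selected by a LIST_NAMES mask."""
    return [name for bit, name in NAME_TYPES.items() if mask & bit]

def dummy_name(kind, seg_name, offset, ea, serial=1, prefix="loc"):
    """Format a dummy name for a few of the DUMMY_NAMES_TYPE styles."""
    styles = {
        "NM_NAM_OFF": "%s_%s_%X" % (prefix, seg_name, offset),
        "NM_NAM_EA":  "%s_%s_%X" % (prefix, seg_name, ea),
        "NM_EA":      "%s_%X" % (prefix, ea),
        "NM_EA4":     "%s_%04X" % (prefix, ea),
        "NM_EA8":     "%s_%08X" % (prefix, ea),
        "NM_SHORT":   "%s_%X" % (seg_name, offset),
        "NM_SERIAL":  "%s_%d" % (prefix, serial),
    }
    return styles[kind]

print(list_names(0x07))                                   # ['normal', 'public', 'auto']
print(list_names(0x02))                                   # ['public']
print(dummy_name("NM_SHORT", "dseg", 0x1234, 0x11234))    # dseg_1234
print(dummy_name("NM_NAM_EA", "dseg", 0x1234, 0x11234))   # loc_dseg_11234
print(dummy_name("NM_EA8", "dseg", 0x12, 0x12))           # loc_00000012
```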
-
- I tend to set the following parameters:
- ASCII_TYPE_AUTO = NO
- ASCII_PREFIX = "str->"
- MAX_NAMES_LENGTH = 15
- LIST_NAMES = 0x03
- DUMMY_NAMES_TYPE = NM_SHORT
- **Note: to use my "str->" prefix you will have to change the following line
- NameChars = "$?@" // asm specific character
- to
- NameChars = "$?@->" // asm specific character
- ...see #7 below. This setup will fill the Names window with strings, exports, and imports.
-
- 4. Processor Specific Parameters
- The PC-specific parameters for IDA are given as follows:
- ====================================================================
- #ifdef __PC__ // INTEL 80x86 PROCESSORS
- USE_FPP = YES
- // Floating Point Processor
- // instructions are enabled
- WINDIR = "c:\\windows" // Default directory to look up for
- // DLL files
- OS2DIR = "c:\\os2" // OS/2 main directory (is used to
- // look up DLLs)
- // IBM PC specific analyser options
- PC_ANALYSE_PUSH = YES // Convert immediate operand of "push" to offset
- // In sequence
- // push seg
- // push num
- // IDA will try to convert <num> to offset.
- PC_ANALYSE_NOP = YES // Convert db 90h after "jmp" to "nop"
- // Sequence
- // jmp short label
- // db 90h
- // will be converted to
- // jmp short label
- // nop
- PC_ANALYSE_MOVOFF = YES // Convert immediate operand of "mov reg,..." to offset
- // In sequence
- // mov reg, num
- // mov segreg, immseg
- // where
- // reg - any general register
- // num - a number
- // segreg - any segment register
- // immseg - any form of operand representing a segment paragraph
- // <num> will be converted to an offset
- PC_ANALYSE_MOVOFF2 = YES // Convert immediate operand of "mov memory,..." to offset
- // In sequence
- // mov x1, num
- // mov x2, seg
- // where
- // x1,x2 - any references to memory
- // <num> will be converted to an offset
- // translation used to build an ASCII string name by its contents
- // (now it is tuned for 866 codepage)
- // the order and number of the string constants is important!
-
- ... a bunch of XLat stuff...
-
- #endif // __PC__
- ====================================================================
- As you can see, there are a few useful disassembly options here, most of which
- are already set. In fact, the only thing you should have to change is the following
- line:
- WINDIR = "c:\\windows\\system"
- This will correctly locate the WinAPI DLLs--it is very important to set this!
-
- 5. Keyboard HotKey Definitions
- This section is mostly a matter of personal taste, but I thought that I would draw attention to it.
- Here are the default keyboard shortcuts (you may want to print this out):
- "LoadFile" = 0 // Load additional file into database
- "LoadIdsFile" = 0 // Load IDS file
- "LoadDbgFile" = 0 // Load DBG file
- "LoadSigFile" = 0 // Load SIG file
- "Execute" = "F2" // Execute IDC file
- "ExecuteLine" = "Shift-F2" // Execute IDC line
- "Shell" = "Alt-Z"
- "About" = 0
- "SaveBase" = "Ctrl-W"
- "SaveBaseAs" = 0
- "Abort" = 0 // Abort IDA, don't save changes
- "Quit" = "Alt-X" // Quit to DOS, save changes
- "ProduceMap" = "Shift-F10" // Produce MAP file
- "ProduceAsm" = "Alt-F10"
- "ProduceLst" = 0
- "ProduceExe" = "Ctrl-F10"
- "ProduceDiff" = 0 // Generate difference file
- "DumpDatabase" = 0 // Dump database to IDC file
- "EditFile" = 0 // Small text editor
- "JumpAsk" = 'G'
- "JumpName" = "Ctrl-L"
- "JumpSegment" = "Ctrl-S"
- "JumpSegmentRegister" = "Ctrl-G"
- "JumpQ" = "Ctrl-Q"
- "JumpPosition" = "Ctrl-M"
- "JumpXref" = "Ctrl-X"
- "JumpOpXref" = "X"
- "JumpFunction" = "Ctrl-P"
- "JumpEntryPoint" = "Ctrl-E"
- "JumpEnter" = "Enter" // jump to address under cursor
- "Return" = "Esc"
- "UndoReturn" = "Ctrl-Enter" // undo the last Esc
- "EmptyStack" = 0 // make the jumps stack empty
- "SetDirection" = "Tab"
- "MarkPosition" = "Alt-M"
- "JumpVoid" = "Ctrl-V"
- "JumpCode" = "Ctrl-C"
- "JumpData" = "Ctrl-D"
- "JumpUnknown" = "Ctrl-U"
- "JumpExplored" = "Ctrl-A"
- "AskNextImmediate" = "Alt-I"
- "JumpImmediate" = "Ctrl-I"
- "AskNextText" = "Alt-T"
- "JumpText" = "Ctrl-T"
- "AskBinaryText" = "Alt-B"
- "JumpBinaryText" = "Ctrl-B"
- "JumpNotFunction" = "Alt-U"
- "MakeJumpTable" = "Alt-J"
- "MakeAlignment" = 'L'
- "MakeCode" = 'C'
- "MakeData" = 'D'
- "MakeAscii" = 'A'
- "MakeArray" = '*'
- "MakeUnknown" = 'U'
- "MakeVariable" = 0
- "SetAssembler" = 0
- "SetNameType" = 0
- "SetDemangledNames" = 0
- "SetColors" = 0
- "MakeName" = 'N'
- "MakeAnyName" = "Ctrl-N"
- "ManualOperand" = "Alt-F1"
- "MakeFunction" = 'P'
- "EditFunction" = "Alt-P"
- "DelFunction" = 0
- "FunctionEnd" = 'E'
- "OpenStackVariables" = "Ctrl-K" // open stack variables window
- "ChangeStackPointer" = "Alt-K" // change value of SP
- "MakeComment" = ':'
- "MakeRptCmt" = ';'
- "MakePredefinedComment" = "Shift-F1"
- "MakeExtraLineA" = "Ins"
- "MakeExtraLineB" = "Shift-Ins"
- "OpNumber" = '#'
- "OpHex" = 'Q'
- "OpDecimal" = 'H'
- "OpOctal" = 0
- "OpBinary" = 'B'
- "OpChar" = 'R'
- "OpSegment" = 'S'
- "OpOffset" = 'O'
- "OpOffsetCs" = "Ctrl-O"
- "OpAnyOffset" = "Alt-R"
- "OpUserOffset" = "Ctrl-R"
- "OpStructOffset" = 'T'
- "OpStackVariable" = 'K'
- "OpEnum" = 'M'
- "ChangeSign" = '-'
- "CreateSegment" = 0
- "EditSegment" = "Alt-S"
- "KillSegment" = 0
- "MoveSegment" = 0
- "SegmentTranslation" = 0
- "SetSegmentRegister" = "Alt-G"
- "SetSegmentRegisterDefault" = 0
- "ShowRegisters" = "Space"
- "OpenSegmentRegisters" = 0 // open various windows:
- "OpenSegments" = 0
- "OpenSelectors" = 0
- "OpenNames" = 0
- "OpenXrefs" = 0
- "OpenFunctions" = 0 // open functions window
- "OpenStructures" = 0 // open structures window
- "OpenEnums" = 0 // open enums window
- "OpenSignatures" = 0 // open signatures window
- "PatchByte" = 0
- "PatchWord" = 0
- "Assemble" = 0
- "TextLook" = 0 // set text representation
- "SetAsciiStyle" = "Alt-A" // set ascii strings style
- "SetAsciiOptions" = 0 // set ascii strings options
- "SetCrossRefsStyle" = 0 // set cross-referneces style
- "SetDirectives" = 0 // setup assembler directives
- "ToggleDump" = "F4" // show dump or normal view
- "SetAuto" = 0 // background analysis
- "ViewFile" = 0
- "Calculate" = '?'
- "ShowFlags" = 'F'
- "WindowOpen" = "F3"
- "WindowMove" = "Ctrl-F5"
- "WindowZoom" = "F5"
- "WindowPrev" = "Shift-F6"
- "WindowNext" = "F6"
- "WindowClose" = "Alt-F3"
- "WindowTile" = "F7"
- "WindowCascade" = "F8"
- "SetProcessor" = 0
- "AddStruct" = "Ins" // add struct type
- "DelStruct" = "Del" // del struct type
- "ExpandStruct" = "Ctrl-E" // expand struct type
- "ShrinkStruct" = "Ctrl-S" // shrink struct type
- "MoveStruct" = 0 // move struct type
- "DeclareStructVar" = "Alt-Q" // declare struct variable
- "AddEnum" = "Ins" // add enum
- "DelEnum" = "Del" // del enum
- "EditEnum" = "Ctrl-E" // edit enum
- "AddConst" = "Ctrl-N" // add new enum member
- "EditConst" = 'N' // edit enum member
- "DelConst" = 'U' // delete enum member
-
- Quite a few, eh? Basically, anything in IDA can have a hotkey. Note all of the 0's in the
- above list: these options have no hotkeys by default. It is generally good to set frequently-used
- operations (ASCII text representation, View Names, Search, etc) up as hotkeys, and to change
- hotkeys which make no sense into better mnemonics.
-
- 6. Analysis Parameters
- IDA by default has the following Analysis Parameters set:
- ====================================================================
- // Analysis parameters
- //-------------------------------------------------------------------------
- ENABLE_ANALYSIS = YES // Background analysis is enabled
- SHOW_INDICATOR = YES // Show background analysis indicator
- #define AF_FIXUP 0x0001 // Create offsets and segments using fixup info
- #define AF_MARKCODE 0x0002 // Mark typical code sequences as code
- #define AF_UNK 0x0004 // Delete instructions with no xrefs
- #define AF_CODE 0x0008 // Trace execution flow
- #define AF_PROC 0x0010 // Create functions if call is present
- #define AF_USED 0x0020 // Analyse and create all xrefs
- #define AF_FLIRT 0x0040 // Use flirt signatures
- #define AF_PROCPTR 0x0080 // Create function if data xref data->code32 exists
- #define AF_JFUNC 0x0100 // Rename jump functions as j_...
- #define AF_NULLSUB 0x0200 // Rename empty functions as nullsub_...
- #define AF_LVAR 0x0400 // Create stack variables
- #define AF_TRACE 0x0800 // Trace stack pointer
- #define AF_ASCII 0x1000 // Create ascii string if data xref exists
- #define AF_IMMOFF 0x2000 // Convert 32bit instruction operand to offset
- #define AF_DREFOFF 0x4000 // Create offset if data xref to seg32 exists
- #define AF_FINAL 0x8000 // Final pass of analysis
- // See also ANALYSIS2, bit AF2_DODATA
- ANALYSIS = 0xFFFF // This value is combination of the defined
- // above bits.
- #define AF2_JUMPTBL 0x0001 // Locate and create jump tables
- #define AF2_DODATA 0x0002 // Coagulate data segs in the final pass
- ANALYSIS2 = 0x0001
- ====================================================================
- Generally, you will not need to change any of these parameters. In case you feel
- like playing with them, though, here is the IDA help file description of each:
-
- Create offsets and segments using fixup info
- IDA will use relocation information to make the disassembly
- nicer. In particular, it will convert all data items with
- relocation information to words or dwords like this:
- dd offset label
- dw seg seg000
- If an instruction has a relocation information attached to it,
- IDA will convert its immediate operand to an offset or segment:
- mov eax, offset label
- You can display the relocation information attached to the current
- item by using the show internal flags command.
- Mark typical code sequences as code
- IDA knows some typical code sequences for each processor.
- For example, it knows about typical sequence
- push bp
- mov bp, sp
- If this option is enabled, IDA will search for all typical sequences
- and convert them to instructions even if there are no references
- to them. The search is performed at the loading time.
- Delete instructions with no xrefs
- This option allows IDA to undefine unreferenced instructions.
- For example, if you undefine an instruction at the start of a
- function, IDA will trace execution flow and delete all instructions
- that lose references to them.
- Trace execution flow
- This option allows IDA to trace execution flow and convert all
- referenced bytes to instructions.
- Create functions if call is present
- This option allows IDA to create a function (proc) if a call
- instruction is present. For example, the presence of:
- call loc_1234
- leads to creation of a function at label loc_1234
- Analyse and create all xrefs
- Without this option IDA will not thoroughly analyse the program.
- If this option is disabled, IDA will simply trace execution flow,
- nothing more (no xrefs, no additional checks, etc)
- Use flirt signatures
- Allows usage of FLIRT technology
- Create function if data xref data->code32 exists
- If IDA encounters a data reference from DATA segment to 32bit
- CODE segment, it will check for the presence of meaningful
- (disassemblable) instruction at the target. If there is an
- instruction, it will mark it as an instruction and will create
- a function there.
- Rename jump functions as j_...
- This option allows IDA to rename simple functions containing only
- jmp somewhere
- instruction to "j_somewhere".
- Rename empty functions as nullsub_...
- This option allows IDA to rename empty functions containing only
- a "return" instruction as "nullsub_..."
- (... is replaced by a serial number: 0,1,2,3...)
- Create stack variables
- This option allows IDA to automatically create stack variables and
- function parameters.
- Trace stack pointer
- This option allows IDA to trace the value of the SP register.
- Create ascii string if data xref exists
- If IDA encounters a data reference to an undefined item, it
- checks for the presence of ASCII string at the target. If the length
- of ASCII string is big enough (more than 4 chars in 16bit or data
- segments; more than 16 chars otherwise), IDA will automatically create
- an ASCII string.
- Convert 32bit instruction operand to offset
- This option works only in 32bit segments.
- If an instruction has an immediate operand and the operand
- can be represented as a meaningful offset expression, IDA will
- convert it to an offset. However, the value of immediate operand
- must be higher than 0x10000.
- Create offset if data xref to seg32 exists
- If IDA encounters a data reference to 32bit segment and the target
- contains 32bit value which can be represented as an offset expression,
- IDA will convert it to an offset
- Make final analysis pass
- This option allows IDA to coagulate all unexplored bytes
- by converting them to data or instructions.
- Locate and create jump tables
- This option allows IDA to try to guess address and size of jump
- tables. Please note that disabling this option will not disable
- the recognition of C-style typical switch constructs.
- Coagulate data in the final pass
- This option is meaningful only if "Make final analysis pass"
- is enabled. It allows IDA to convert unexplored bytes
- to data arrays in the data segments. If this option is disabled,
- IDA will coagulate only code segments.
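
Since ANALYSIS and ANALYSIS2 are just the AF_ and AF2_ bits OR'ed together, working out a new value is plain bit arithmetic. Here is a Python sketch for doing it without counting bits by hand; the bit values are copied from the defines above, while the helper functions are my own invention and not part of IDA.

```python
# AF_ bits copied from the ida.cfg defines quoted above.
AF = {
    "AF_FIXUP": 0x0001, "AF_MARKCODE": 0x0002, "AF_UNK": 0x0004, "AF_CODE": 0x0008,
    "AF_PROC": 0x0010, "AF_USED": 0x0020, "AF_FLIRT": 0x0040, "AF_PROCPTR": 0x0080,
    "AF_JFUNC": 0x0100, "AF_NULLSUB": 0x0200, "AF_LVAR": 0x0400, "AF_TRACE": 0x0800,
    "AF_ASCII": 0x1000, "AF_IMMOFF": 0x2000, "AF_DREFOFF": 0x4000, "AF_FINAL": 0x8000,
}
AF2 = {"AF2_JUMPTBL": 0x0001, "AF2_DODATA": 0x0002}

def combine(flags, disabled=()):
    """OR together every bit in flags except the ones named in disabled."""
    value = 0
    for name, bit in flags.items():
        if name not in disabled:
            value |= bit
    return value

def enabled(flags, value):
    """List the flag names that are set in value."""
    return [name for name, bit in flags.items() if value & bit]

print("0x%04X" % combine(AF))                        # 0xFFFF (the default)
print("0x%04X" % combine(AF, disabled=["AF_UNK"]))   # 0xFFFB
print("0x%04X" % combine(AF2))                       # 0x0003
print(enabled(AF2, 0x0001))                          # ['AF2_JUMPTBL']
```

So, for example, turning off "Delete instructions with no xrefs" while leaving everything else alone means ANALYSIS = 0xFFFB, and the default ANALYSIS2 of 0x0001 has only the jump-table bit set.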
-
- 7. Character Translations and Allowed Character Lists
- The default character rules supplied with IDA are as follows:
- ====================================================================
- // Character translations and allowed character lists
- //-------------------------------------------------------------------------
- // translation when ASCII string name is built using its contents
- XlatAsciiName =
- /*00..0F*/ "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F"
- /*10..1F*/ "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F"
- /*20..3F*/ " !\"# %&'()*+,-_/"
- "0123456789:;<=>?"
- /*40..5F*/ "@ABCDEFGHIJKLMNO"
- "PQRSTUVWXYZ[\\]^_"
- /*60..7F*/ "`abcdefghijklmno"
- "pqrstuvwxyz{|}~"
- /*80..9F*/ "ABVGDEJZIIKLMNOP"
- "RSTUFXCCSS I AUQ"
- /*A0..BF*/ "abvgdejziiklmnop"
- "ªªªªªªª++ªª+++++"
- /*C0..DF*/ "+--+-+ªª++--ª-+-"
- "---++++++++ª_ªª_"
- /*E0..FF*/ "rstufxccss i auq"
- "=▌==()~~▌++vn▌ªá";
- // the following characters are allowed in ASCII strings, i.e.
- // in order to find end of a string IDA looks for a character
- // which doesn't belong to this array:
- AsciiStringChars =
- "\r\n\a\v\b\t\x1B"
- " !\"#$%&'()*+,-./0123456789:;<=>?"
- "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
- "`abcdefghijklmnopqrstuvwxyz{|}~"
- "▌nTGSastOdFne8-++µ▌(÷=v· +_óúÑPâ"
- "ßf=·±-¬▌+¼¼++í½+ªªªªªªª++ªª+++++"
- "+--+-+ªª++--ª-+----++++++++ª_ªª_"
- "a_GpSs▌tFTOd8fen";
-
- // the following characters are allowed in user-defined names:
- NameChars =
- "$?@" // asm specific character
- "_0123456789"
- "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- "abcdefghijklmnopqrstuvwxyz";
- // the following characters are allowed in mangled names.
- // they will be substituted with the SubstChar during output if names
- // are output in a mangled form.
- MangleChars = "$:?([.)]" // watcom
- "@$%?" // microsoft
- "@$%"; // borland
- SubstChar = '_'
- ====================================================================
- Of these, two areas are of interest. The first is the "NameChars" section, which
- dictates which characters may be used for naming an address. For maximum flexibility
- (and to help make IDC scripts that automatically generate names run better), you may
- want to increase the characters in this section to include the full range, i.e.
- "$?@"
- becomes:
- "$?@!#%^&*-+=~|\}{[]:;><,./"
- although this is strictly up to the user. The MangleChars section is also important for
- those working from code compiled with mangling set on; if the compiler of the target uses
- different mangling characters than the ones listed (rare), you can include them here--you
- can also change the character with which the mangled characters are replaced by changing
- the SubstChar value.
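
Both lists boil down to simple membership tests, which a short Python sketch can mimic. This is only my reading of the comments in the config file, not IDA's actual code, and the function names are made up. It shows why the "str->" prefix from section 3 needs '-' and '>' added to NameChars, and what the SubstChar replacement does to a mangled-looking name.

```python
import string

# Default lists as shown in the ida.cfg excerpt above.
DEFAULT_NAME_CHARS = "$?@" + "_0123456789" + string.ascii_uppercase + string.ascii_lowercase
MANGLE_CHARS = "$:?([.)]" + "@$%?" + "@$%"
SUBST_CHAR = "_"

def bad_chars(name, name_chars):
    """Return the characters of name that name_chars does not allow."""
    return [c for c in name if c not in name_chars]

def substitute_mangle_chars(name):
    """Replace every MangleChars character in name with SubstChar."""
    return "".join(SUBST_CHAR if c in MANGLE_CHARS else c for c in name)

print(bad_chars("str->Untitled", DEFAULT_NAME_CHARS))          # ['-', '>']
print(bad_chars("str->Untitled", DEFAULT_NAME_CHARS + "->"))   # []
print(substitute_mangle_chars("?Foo@Bar"))                     # _Foo_Bar
```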
-
-
- Loading a program
- -----------------
- For all of the examples in this primer, I will be using notepad.exe as a target; I will
- also be assuming that the configuration changes mentioned above have been made. To begin,
- launch IDAW.EXE, type "c:\windows\notepad.exe" at the "Select File" dialog box, and press
- OK.
-
- Immediately IDA will bring up a dialog box prompting you for loading options. Make sure that
- Portable Executable is checked (for Win32 files), that "Create Segments", "Load Resources", and
- "Make Imports Section" are checked, and that "Rename DLL Entries" is unchecked. Also ensure that
- the "DLL directory" is set to the location of kernel32.dll et al., usually C:\windows\system.
-
- Press OK, and wait for the green "Ready" notice to appear in the upper left of the IDA menu bar.
-
- A few notes about the IDA user interface may be helpful at this point. IDA uses the text-mode windowing
- techniques common in console-mode applications; each window has a top border with a green square (close),
- a title, and a green arrow (restore/maximize), a right border with a vertical scroll bar, and a bottom
- border with a horizontal scrollbar and a green corner (resize); the windows may be moved by dragging on
- the title bar, or resized by dragging on the green corner. F6 switches between windows (like Alt-Tab),
- F7 tiles all windows (except the Messages Window, which is like a desktop), and F8 cascades all windows.
-
- Note that the disassembled listing is referred to as the Code Window or Text Window; you can open multiple
- views of the same program by selecting the View->Disassembly menu item, or by pressing F3.
-
- As with any Windows DOS box, clicking on the small MS-DOS icon (for the system menu) gives you an Edit
- submenu with Mark and Copy options; to copy text out of IDA and into a Windows editor, select Edit->Mark,
- highlight the text you want to copy, then select Edit->Copy, then go to the Windows editor and Ctrl-V
- (or Edit->Paste) to insert the text selected from IDA.
-
- Viewing Imports
- ---------------
- All of the program's imports will appear as names in the program, and may be viewed in the Names window
- by selecting the View->Names menu item; however, as this contains all of the names in the program it may be
- a bit confusing. Double-clicking on the name of an import will bring you to its entry in the .idata segment
- (see below).
-
- Another way to view the imports is to select the View->Segments menu item, which will bring up the Segments
- window. Double-click on the .idata segment; this will jump the disassembled listing to the start of the .idata
- segment, which will contain all of the program's imports in pink text. To the right of each import, at the end of
- the line, will be a list of addresses in the program which call that import. Double-clicking on one of these
- addresses will jump the disassembled listing to that address.
-
- Example: View the .idata segment of Notepad.exe as mentioned above. The imports are sorted by module; scroll down
- to the Kernel32.dll imports and find the one for "lstrcmpA". You should see a line like this:
- |00407300 ?? ?? ?? ?? extrn lstrcmpA:dword ; DATA XREF: sub_401FAC+15
- |00407300 ; sub_4045AF+3E^r
- |00407300 ; .text:004046B9^r
- |00407300 ; .text:004046DD^r
- Each of the locations after a ";" is an address in the file that calls lstrcmpA; these are known as cross-references,
- or X-refs for short. Double-click on the first one; note how it brings you to
- |00401FC1 FF 15 00 73 40 00 call ds:lstrcmpA
- |00401FC7 85 C0 test eax, eax
- |00401FC9 75 10 jnz short loc_401FDB
- Press Esc to go back to the lstrcmpA entry, then double-click on each of the remaining X-refs to scope out the caller
- code. Note how you can scout out each caller routine by double clicking on call/jmp locations within the code, and by
- double-clicking on X-refs to see who initiated the caller routine; Esc, as always, returns you back the way you came,
- one step at a time.
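-
- The X-refs that IDA lists for an import can be roughly approximated by hand: an indirect call through the import
- table is encoded as FF 15 followed by the little-endian address of the import slot, exactly as in the
- FF 15 00 73 40 00 bytes shown above. Here is a minimal sketch in Python (not IDC) of that idea; the function name,
- the sample buffer and the load address are made up for illustration, and it only catches the "call ds:[addr]"
- form, so treat it as a rough stand-in for IDA's cross-reference tracking rather than a replacement for it.
-
```python
import struct

def find_import_calls(code, base, import_addr):
    """Return addresses of 'call dword ptr [import_addr]' (FF 15 imm32) in code."""
    # FF 15 is the opcode, followed by the 32-bit slot address in little-endian order.
    pattern = b"\xFF\x15" + struct.pack("<I", import_addr)
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits

# Hypothetical buffer: the three instructions from the listing above, loaded at 0x401FC1.
sample = bytes.fromhex("FF1500734000" "85C0" "7510")
for addr in find_import_calls(sample, 0x401FC1, 0x407300):
    print("call to import at %08X" % addr)
```
-
- Run against a real dump of the .text section, this would list the same call sites that IDA shows as X-refs of the
- "r" (read) type, though jumps through thunks and register-based calls would be missed.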
-
- A final method of viewing imports is to write an IDC script. IDC is the IDA macro language; it stands for IDA-C much
- in the way that QCC stands for Quake-C. All IDA scripts must include the file IDC.IDC, which contains a number of
- internal IDA functions and constants. The IDC language is a lot like C, and is described in the file IDC.TXT--here is
- a brief excerpt summarizing the language:
- ====================================================================
- IDC supports the following statements:
- if (expression) statement
- if (expression) statement else statement
- for ( expr1; expr2; expr3 ) statement
- while (expression) statement
- do statement while (expression);
- break;
- continue;
- return <expr>;
- return; the same as 'return 0;'
- { statements... }
- expression; (expression-statement)
- ; (empty statement)
- In expressions you may use almost all C operations except:
- ++,--
- complex assigment operations as '+='
- , (comma operation)
- Here is how a function is declared :
- static func(arg1,arg2,arg3) {
- ...
- }
- Here is how a variable is declared :
- auto var;
- ====================================================================
- That said and done, here is a script for listing the file's imports by API module
- to the IDA Messages window (the blue one with all of the yellow writing on it):
- ====================================================================
- //Imports.idc : Outputs list of imported functions to the Message Window
- #include <idc.idc>
-
- static GetImportSeg()
- {
- auto ea;
- ea = FirstSeg();
- while ( ea != BADADDR ) { //check every segment, including the first
- if ( substr( SegName(ea), 0, 6 ) == ".idata" ) break;
- ea = NextSeg(ea);
- }
- return ea; //BADADDR if there is no .idata segment
- }
-
- static main()
- {
- auto BytePtr, EndImports;
- BytePtr = SegStart( GetImportSeg() );
- EndImports = SegEnd( BytePtr );
- Message(" \n" + "Parsing Import Table...\n");
- while ( BytePtr < EndImports ) {
- if (LineA(BytePtr, 1) != "") Message("\n" + "____" + LineA(BytePtr,1) + "____" + "\n");
- Message(Name(BytePtr) + "\n");
- BytePtr = NextAddr(BytePtr);
- }
- Message("\n" + "Import Table Parsing Complete\n");
- }
- ====================================================================
- The coding is pretty straightforward if you know C: the script finds the .idata segment,
- prints each non-blank anterior comment line (i.e., the line that tells what API module
- the following imports belong to), then prints the Name of each defined/named address in the
- .idata section. The script is executed by pressing F2 and selecting "imports.idc", assuming
- that you have saved the script as imports.idc in the \IDA37?\IDC directory.
-
- Viewing Exports
- ---------------
- Viewing exported functions is a little easier. Perhaps the quickest way is to select the Options->Name Representation
- menu item, and mark the "type of names" dialog so it includes only publics, as follows:
- Types of names included in the list of names:
- [ ] Normal
- [X] Public
- [ ] Autogenerated
- [ ] Weak
- Press Ok and then select the View->Names menu item; the Names window will now only contain the exported functions of
- the program. As with any of the Names/Segments/etc windows, double clicking on any line will bring that function up
- in the "code window". [Note: if you have modified the IDA.cfg file as mentioned above, you can also browse the imports
- in this manner by checking only "Normal" in the dialog box illustrated above, then ignoring everything with a "str->"
- prefix; the remainder will be imports.]
-
- If the program has an .edata segment, you can also view the exports there much in the same manner as in the .idata method
- given in the previous section. Note that Notepad has only one export ("start", the program entry point) and also has no
- .edata segment.
-
- The IDC method works for exports as well. The following IDC script searches for entry points into the program and displays them
- in the message window:
- ====================================================================
- //exports.idc : display program entry points to the message window
- #include <idc.idc>
-
- static main()
- {
- auto x, ord, ea;
- Message("\n Program Entry Points: \n \n");
- for ( x=0; x < GetEntryPointQty(); x = x+1){ //indexes run from 0 to Qty-1
- ord = GetEntryOrdinal( x );
- ea = GetEntryPoint( ord );
- Message( Name( ea ) + ": Ordinal " + ltoa( ord,16 ) + " at offset " + ltoa( ea, 16) + "\n");
- }
- Message("\n" + "Export Parsing Complete\n");
- }
- ====================================================================
- Once again, this script may be run by pressing F2 and selecting "exports.idc".
-
- Viewing Strings/Resources
- -------------------------
- The strings can be previewed by selecting "Normal" as the "Type of names to be shown in the list of names" in the
- Options->Name Representation dialog box, and then looking for everything beginning with the prefix "str->" (or "a",
- if using IDA straight out of the box).
-
- In PE files, strings are commonly kept in a string table in the .rsrc segment. However, IDA does not by default
- parse the .rsrc segment for strings. Thus, an IDC script can be written to parse the .rsrc section for us, creating
- strings where any standard ASCII character is found so that the strings may be browsed either in the .rsrc segment,
- or in the names window:
- ====================================================================
- //RSRC_Strings.IDC
- //define all std ASCII characters in the .rsrc segment as strings
- #include <idc.idc> //This file contains all of the
- //function protos we will be using
- static main(){
- auto ea, seg_end; //auto is the standard variable type
- ea = FirstSeg(); //Get Addr of first segment into ea
- while (ea !=BADADDR) {
- Message( "Analyzing " + SegName(ea) + "...\n" );
- //Is this the .RSRC segment? If so...
- if ( SegName(ea) == ".rsrc"){
- Message(" RSRC found!\n");
- seg_end = SegEnd(ea); //Save the end address before ea starts moving
- while ( ea < seg_end ) {
- //Change every printable ASCII character (0x20-0x7E) into a string
- if ( Byte(ea) > 0x1F && Byte(ea) < 0x7F){
- MakeStr( ea, -1 );
- MakeRptCmt(ea, Name(ea));
- ea = ea + ItemSize( ea );
- }
- else ea = ea + 1;
- }
- ea = seg_end - 1; //Stay inside .rsrc so NextSeg() finds the following segment
- }
- ea=NextSeg(ea); //Goto Next Segment
- }
- Message("Done!\n");
- }
- ====================================================================
- The IDC script is functional, though not perfect (plenty of random bytes get
- defined as strings, but it is a quick up-and-running script). Notice that IDC.IDC
- contains a lot of function prototypes for use in IDC scripts; by including it, you
- are able to call all of the FirstSeg(), NextSeg(), etc functions. These functions
- are poorly documented, but the commented prototypes should give you enough to go
- on.
-
- The IDC script can be placed in the \IDC directory and run by pressing F2 and choosing
- the rsrc_strings.idc script. Note that this script assumes that you have the default
- string type set as "Unicode"; as such it will parse any Unicode resource names or values
- in the .rsrc segment. For a full-fledged resource parsing IDC script a lot more work is
- in order; I have started such a project with a script known as reslib.idc (too large to
- include here) which is publicly available.
-
- After running this script we can create and run a second one which will print out all of the
- strings (that is, every location name that begins with "str->") in the disassembled listing:
- ====================================================================
- //ss.idc : display all strings in the program
- #include <idc.idc>
-
- static main()
- {
- auto ea;
- ea = FirstSeg();
- Message("\n" + "Strings in Application: \n \n");
- while( ea != BADADDR) {
- if( substr( Name(ea), 0, 5) == "str->") {
- Message( substr(Name(ea), 5, -1) + " at address " + ltoa( ea, 16) + "\n" );
- }
- ea = NextAddr(ea);
- }
- Message("\n" + "String Listing Complete\n");
- }
- ====================================================================
- Running this after the previous IDC script will reveal the flaw in
- the first one: a lot of garbage ASCII bytes are listed as strings--more,
- in fact, than there are actual strings. For this reason it is important
- to refine your scripts so they print out only the string table and resource
- names in the .rsrc section (as I have done with the reslib.idc script),
- rather than blindly naming locations.
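-
- One cheap way to cut down on the garbage is to require a minimum run length before calling something a string,
- which is what the classic "strings" utilities do. The following is a minimal sketch in Python (not IDC) of that
- filter, working on a raw dump of the section; the function names, the 4-character threshold and the sample bytes
- are my own assumptions for illustration, and reslib.idc itself works differently (it walks the resource tables).
-
```python
import re

def ascii_strings(data, min_len=4):
    """Yield (offset, text) for runs of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7E]{%d,}" % min_len)
    for m in pattern.finditer(data):
        yield m.start(), m.group().decode("ascii")

def utf16_strings(data, min_len=4):
    """Same idea for Unicode text: printable ASCII bytes, each followed by 00."""
    pattern = re.compile(rb"(?:[\x20-\x7E]\x00){%d,}" % min_len)
    for m in pattern.finditer(data):
        yield m.start(), m.group().decode("utf-16-le")

# Hypothetical section contents: junk, a short false positive ("Ab"),
# one ASCII string and one Unicode string.
sample = b"\x01\x02Ab\x00\x03Cannot\x01" + "Quit".encode("utf-16-le") + b"\x00\x00"
for offset, text in list(ascii_strings(sample)) + list(utf16_strings(sample)):
    print("%04X %s" % (offset, text))
```
-
- The two-byte run "Ab" is dropped by the length check, which is exactly the kind of hit the naive script above
- would have turned into a named string.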
-
- Searching for Strings/Code
- --------------------------
- Once you have defined strings, you can search for them using the Navigate->
- Search For->Text... menu item. For instance, entering the string "Cannot" at
- this dialog box will bring up the "YouCannotQuitWindows" string in the Code
- window. The shortcut for FindText is Alt-T, and for FindNextText is Ctrl-T. A
- "Pattern is not found" message will appear at the bottom of the message window
- when there are no more occurrences of the text.
-
- What if your string has not been defined? If it is not Unicode, then you can
- search for it using Navigate->SearchFor->Text In Core... (Alt-B), by entering
- the string in quotes at the dialog box, as follows:
- +-[_]--------------- Binary search --------------------+
- ▌ ▌
- ▌ Enter search (down) string: ▌
- ▌ String "FindReplace" _▌▌
- ▌ ▌
- ▌ [X] Case-sensitive () Hex ▌
- ▌ ( ) Decimal ▌
- ▌ ( ) Octal ▌
- ▌ ▌
- ▌ OK _ Cancel _ F1 for Help_ ▌
- ▌ ________ ________ ____________ ▌
- +------------------------------------------------------+
- This will find occurrences of "FindReplace" in the file. You can also search
- for the text using the hexadecimal equivalents of the ASCII characters:
- +-[_]--------------- Binary search --------------------+
- ▌ ▌
- ▌ Enter search (down) string: ▌
- ▌ String 46 69 6E 64 _▌▌
- ▌ ▌
- ▌ [X] Case-sensitive () Hex ▌
- ▌ ( ) Decimal ▌
- ▌ ( ) Octal ▌
- ▌ ▌
- ▌ OK _ Cancel _ F1 for Help_ ▌
- ▌ ________ ________ ____________ ▌
- +------------------------------------------------------+
- This will search for "Find" in the disassembled listing. In
- this way you can search for Unicode strings as well:
- +-[_]--------------- Binary search --------------------+
- ▌ ▌
- ▌ Enter search (down) string: ▌
- ▌ String 43 00 61 00 6E 00 6E _▌▌
- ▌ ▌
- ▌ [X] Case-sensitive () Hex ▌
- ▌ ( ) Decimal ▌
- ▌ ( ) Octal ▌
- ▌ ▌
- ▌ OK _ Cancel _ F1 for Help_ ▌
- ▌ ________ ________ ____________ ▌
- +------------------------------------------------------+
- This will search for the Unicode string "Cannot" (the pattern covers its
- first four characters). Note that simply searching for the string "Cannot"
- will fail due to the 00 bytes that Unicode inserts between characters. Thus,
- to find Unicode strings with the ordinary text search, they must be defined first.
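-
- Working out the hex bytes by hand gets old quickly. Here is a small sketch in Python (not IDC) that produces the
- pattern to type into the Binary search dialog for either an ASCII or a Unicode string; the function name and its
- "unicode" parameter are hypothetical, chosen only for this example.
-
```python
def search_pattern(text, unicode=False):
    """Return the space-separated hex bytes for the Binary search dialog."""
    # UTF-16LE is the Windows "Unicode" layout: each ASCII byte followed by 00.
    raw = text.encode("utf-16-le" if unicode else "ascii")
    return " ".join("%02X" % b for b in raw)

print(search_pattern("Find"))                  # 46 69 6E 64
print(search_pattern("Cannot", unicode=True))  # 43 00 61 00 6E 00 6E 00 6F 00 74 00
```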
-
- Searching for code can be done in the same way, using the Text In Core
- method. For example, the following will search for "test eax, eax":
- +-[_]--------------- Binary search --------------------+
- ▌ ▌
- ▌ Enter search (down) string: ▌
- ▌ String 85 C0 _▌▌
- ▌ ▌
- ▌ [X] Case-sensitive () Hex ▌
- ▌ ( ) Decimal ▌
- ▌ ( ) Octal ▌
- ▌ ▌
- ▌ OK _ Cancel _ F1 for Help_ ▌
- ▌ ________ ________ ____________ ▌
- +------------------------------------------------------+
- You can use the standard Text search for instructions as well, but only one token at a time:
- you can search for the text "test" but not "test eax, eax", so expect quite a few hits.
-
- There is, of course, a final option to make searching for strings much easier--you can write an
- IDC script to front-end for the "Search for Text In Core" function. The following IDC script will
- do just that, allowing you to enter a text string to search for, then converting the string to
- hexadecimal and feeding it to the "Text In Core" function:
- ====================================================================
- // textsearch.idc : search for undefined strings
- #include <idc.idc>
-
- static main()
- {
- auto ea, x, y, searchstr, temp_c, binstr, array_id, alphabet, bin_c, cont;
- ea = FirstSeg();
- binstr = ""; //start with an empty pattern
- // ---- Create Array Of ASCII Characters ------------------------
- // ---- Note that the index of each char = its decimal value ----
- array_id = CreateArray("AsciiTable");
- alphabet = "0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz";
- y = 48;
- for (x = 0; x < strlen(alphabet); x = x + 1 ) {
- SetArrayString( array_id, y, substr(alphabet, x, x+1));
- y = y +1;
- }
- // ---- Prompt User For Search String ----------------------------
- searchstr = AskStr("", "Enter a search string:\n");
- // ---- Cycle through array looking for match --------------------
- for (x = 0; x < strlen(searchstr); x = x + 1 ) {
- temp_c = substr(searchstr, x, x + 1 );
- for( y = GetFirstIndex(AR_STR, array_id); y <= GetLastIndex(AR_STR, array_id); y = GetNextIndex(AR_STR, array_id, y) ) {
- if (temp_c == GetArrayElement(AR_STR, array_id, y)) {
- bin_c = y;
- break;
- } //End "If Match"
- } //End Array Loop
- binstr = form("%s %X", binstr, bin_c); //Standard Version
- //binstr = form("%s %X 00", binstr, bin_c); //Unicode Version
- } //End Search String Loop
- Message("Search string is " + binstr + "\n"); //Debug Control
- // -------- "Search" and "Search Again" Loop... --------------------
- cont = 1;
- while (cont==1) {
- ea = FindBinary(ea, 1, binstr); //Search From ea
- if( ea == -1) { //If No Hits
- Warning("No more occurrences"); //MessageBox
- cont = 0;
- break; //Leave
- }
- Jump(ea); //Position Cursor At Hit
- cont = AskYN( 1, "Find next occurrence?" ); //Search Again?
- }
- // --------- Cleanup and Exit
- Message("\n" + "Search Complete\n");
- DeleteArray(array_id);
- }
- ====================================================================
-
- Location Names
- --------------
- In IDA, location names are your greatest asset. Naming locations whose purpose
- you know or suspect allows you to quickly browse the code for references to that
- location. For example, do the following:
- 1. Go to the lstrcmp import listing
- 2. Double Click on the first X-ref; this should put you at 00401FC1
- 3. Scroll up to the start of the function (401FAC) and use the N command to name it "StringCmpFunc"
- 4. Rename 401FDB to "StringCmpFailed" (because of the JNZ at 401FC9)
- 5. Name 402033 "GoodStringName" (for the JMP at 401FD9; names cannot contain spaces)
- Instantly the function is more readable. Now, go to the X-refs at 401FAC and double click on the
- first one; this will put you at 00402816 (yes, we are back-tracing! Great, isn't it?). Here you are in
- a great huge routine, and the "StringCmpFunc" stands out from the rest in bright yellow. The rest of the
- internal functions (sub_???????) can be named in the same way.
-
- Now some elementary searching and browsing: You'll notice that you can see all of the names you created with
- the N command in the Names window. Using Alt-T (search text), you can look for occurrences of StringCmpFunc
- in the disassembled listing, which will show you all of the locations that reference this function.
-
- Ok, comments: you can comment code using the ";" key. Go back to the "StringCmpFailed" location (look it up
- in the Names window), hit the ";" key and type in the text "Bad String Entered!". This is what is known as a
- "repeatable comment". Why repeatable? Because every address that refers to this location will now have that comment
- suffixed to it--go back up to 401FC9 to verify. Cool, eh? You will never go back to W32Dasm...
-
- Producing an Output File
- ------------------------
- Producing an output file is relatively simple. If you want a full listing of the names, comments, addresses, in
- short everything in the Code Window, use File->Produce Output File->Produce LST File. If you just want the ASM
- source code, with no addresses, use File->Produce Output File->Produce ASM File. If you want to produce a tiny file
- that will make all of the changes that you just made to an executable (in case you want someone else to be able to
- duplicate your .idb [idb: IDA database, containing all of your changes to the exe and the disassembled listing]),
- use File->Produce Output File->Produce IDC file--this will create an IDC script that, when run, will leave the
- disassembled listing identical to yours.
-
- Advanced Techniques
- -------------------
- 1.IDS files and Comment Databases
- Custom IDS files are very useful; you will need to download the IDS utilities from
- http://www.unibest.ru/~ig/idsutil.zip
- Basically, you create an IDT file from a .DLL by running the DLL2IDT utility. From there you can comment the
- IDT file and compress it into an IDS file using ZIPIDS, and finally move it to the appropriate subdirectory
- (based on OS) of \IDS.
-
- An IDT file looks like this:
- ALIGNMENT 4
- ;DECLARATION
- ;
- 0 Name=ADVAPI32.dll
- ;
- 1 Name=AbortSystemShutdownA
- 2 Name=AbortSystemShutdownW
- 3 Name=AccessCheck
- 4 Name=AccessCheckAndAuditAlarmA
- 5 Name=AccessCheckAndAuditAlarmW
- 6 Name=AddAccessAllowedAce
- 7 Name=AddAccessDeniedAce
- 8 Name=AddAce
- 9 Name=AddAuditAccessAce
- 10 Name=AdjustTokenGroups
- 11 Name=AdjustTokenPrivileges
- ...
-
- With this file, you can provide comments for various functions by adding "Comment=" lines to each, for example:
- 154 Name=RegCreateKeyA Comment=Create a Key in the System Registry
-
- Note that an IDT line has the following structure:
- Ordinal Name=name Args=args Drops=drops Pascal=pascal Typeinfo=type Comment=comment RptCmt=ord#
- The keywords are defined as follows:
- Name : name of entry point [string]
- Args : number of bytes occupied by entry point arguments [number]
- Drops : number of bytes purged from the stack upon return [number]
- Pascal : the same as Args=Drops= [number]
- Typeinfo : entry point function prototype (type of input/output arguments) [string]
- Comment : a comment for this entry point [string]
- RptCmt : use the comment from the specified entry point [number]
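-
- To make the line structure concrete, here is a minimal sketch in Python that splits an IDT entry line into its
- ordinal and keyword fields. It is only an illustration built from the keyword list above: the function name is
- made up, it handles entry lines only (not the ALIGNMENT or ";" lines), and it would be fooled by a comment that
- itself contains text such as "Name=".
-
```python
import re

# Keywords taken from the list above; matched case-insensitively because the
# last one appears with more than one capitalization.
KEYWORDS = ("Name", "Args", "Drops", "Pascal", "Typeinfo", "Comment", "RptCmt")
FIELD = re.compile(r"\b(%s)=" % "|".join(KEYWORDS), re.IGNORECASE)

def parse_idt_line(line):
    """Split 'ordinal Key=value ...' into (ordinal, {key: value})."""
    ordinal, _, rest = line.strip().partition(" ")
    parts = FIELD.split(rest)          # ['', key1, value1, key2, value2, ...]
    fields = {}
    for key, value in zip(parts[1::2], parts[2::2]):
        fields[key] = value.strip()    # values may contain spaces (comments)
    return int(ordinal), fields

print(parse_idt_line("154 Name=RegCreateKeyA Comment=Create a Key in the System Registry"))
```
-
- Splitting on the known keywords rather than on whitespace is what lets the Comment value keep its spaces.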
-
- Wouldn't it be nice to have all of the API prototypes entered as comments into the IDS files? Well, it can
- be done, though no-one in their right mind would attempt it by hand. One of the most basic programming tools,
- grep.exe, will allow you to search an entire directory for lines in any file containing a specific search pattern.
- If you were to grep an entire directory for WINAPI or STDCALL, you would then have as output a file with every
- 1-line API prototype in it. The following perl script will take an .idt file and grep output file, and output an
- .idt file commented with the API prototypes to stdout or a specified filename:
- ====================================================================
- #!/usr/bin/perl
-
- if ($#ARGV < 1) { #need at least the idt file and the grep file
- print "Usage: h2idt [idtfile] [grepfile] [outfile]\n";
- print "Output defaults to stdout\n";
- exit (1);
- }
-
- $idtfile = $ARGV[0];
- $grepfile = $ARGV[1];
- if ($#ARGV == 2) {
- $outfile = ">" . $ARGV[2];
- } else {
- $outfile = ">-";
- }
- open(IDTFILE, $idtfile)|| die "Can't open file: $!\n";
- open(GREPFILE, $grepfile) || die "Can't open file: $!\n";
- open(OUTFILE, "$outfile") || die "Can't create file: $!\n";
- $i =0;
- foreach (<GREPFILE>){
- s/[\r\n]+$//; #strip trailing CR/LF
- $greparray[$i] = $_;
- $i++;
- }
-
- print OUTFILE ";DECLARATION \n";
- print OUTFILE ";ALIGNMENT 2 \n\n";
- print OUTFILE "; Module Name and Description \n";
-
- foreach (<IDTFILE>) {
- if ( /^0/ ){
- s/\\//;
- print OUTFILE $_;
- print OUTFILE ";---------------------------------------\n";
- next; #module line handled, go on to the next line
- } elsif ( /Name=/ ){
- if (/\n/){
- chop; #get rid of LF
- }
- if (/\r/){
- chop; #get rid of CR
- }
- $outstr = $_;
- ($junk, $searchstr, $junk) = split(' ', $_, 3);
- $searchstr =~ s/Name=//;
- $comment='';
- foreach(@greparray) {
- if (/\s$searchstr\(/) {
- $comment = $_;
- }
- }
- $outstr =~ s/\\//;
- if ($comment ne '') { #string comparison; != would compare numerically
- $comment =~ s/^[^a-zA-Z]+//;
- $comment =~ s/\n//;
- $comment =~ s/;//;
- $comment =~ s/STDCALL\s//;
- print OUTFILE $outstr, " Comment=", $comment, "\n";
- }else {
- print OUTFILE $outstr, "\n";
- }
- }
- }
- print OUTFILE ";------------------EOF------------------";
- close(OUTFILE);
- close(GREPFILE);
- close(IDTFILE);
- exit (0);
- ====================================================================
- As usual with Perl/unix files, strip the above for CR/LF's before you run it in perl (you can use
- Editeur, or nedit for this, depending on your OS). So, how do you do this from NT? Well, assuming
- you have the NT resource kit, the process for extracting an IDT file from an existing IDS file,
- grepping for prototypes (I use LCC as the protos are all 1-line), creating the commented IDT file
- and compressing it into an IDS file, is as follows:
-
- c:\ntreskit\posix\grep STDCALL c:\lcc\include\* > grep.out
- c:\ida\Utility\IDSUtil\WIN32\zipids -u c:\ida\Ids\Win\kernel32.ids
- c:\ntreskit\perl\perl.exe h2idt kernel32.idt grep.out out.idt
- c:\ida\Utility\IDSUtil\WIN32\zipids out.idt
- ren out.ids kernel32.ids
-
- You will get --in the IDT file-- output similar to the following:
- ===========================================================================
- ;DECLARATION
- ;ALIGNMENT 2
-
- ; Module Name and Description
- 0 Name=KERNEL32.dll
- ;---------------------------------------
- 50 Name=AddAtomA Pascal=2 Comment=ATOM AddAtomA(LPCSTR);
- 102 Name=AddAtomW Pascal=2 Comment=ATOM AddAtomW(LPCWSTR);
- 103 Name=AllocConsole Pascal=0 Comment=BOOL AllocConsole(VOID);
- 104 Name=AllocLSCallbac Comment=BOOL AllocConsole(VOID);
- ===========================================================================
- Note that everything after the "Comment=" will appear in the comment margin of IDA.
-
-
- In addition to the IDS files, you can also maintain a database of comments that
- will be inserted into the code upon disassembly. The IDA comment database is stored
- in the IDA.INT file, and it can be modified with the LoadINT utility available at
- http://www.unibest.ru/~ig/ldint37.zip
-
- The Readme file best documents how to edit this database, but to show you a brief example of
- the comments supplied with IDA, here is an excerpt from the PC section of the INT:
- // MMX instructions
- NN_emms: "Empty MMX state"
- NN_movd: "Move 32 bits"
- NN_movq: "Move 64 bits"
- NN_packsswb: "Pack with Signed Saturation (Word->Byte)"
- NN_packssdw: "Pack with Signed Saturation (Dword->Word)"
- NN_packuswb: "Pack with Unsigned Saturation (Word->Byte)"
- NN_paddb: "Packed Add Byte"
- NN_paddw: "Packed Add Word"
- NN_paddd: "Packed Add Dword"
-
- These comments will appear (if "auto comments" is turned on) whenever the
- opcode is encountered in the disassembly; note that you can browse through the .cmt files included
- with LoadINT to see what the existing comments are. The most interesting will be int.cmt, pc.cmt,
- portin.cmt, portout.cmt, and vxd.cmt. It is tempting --but rather daunting-- to port Ralf Brown's
- Interrupt List comments to an INT database...
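-
- If you want to look comments up outside of IDA, the entry format in the excerpt above is simple enough to parse.
- This is a minimal sketch in Python that assumes every entry looks like the NN_ lines shown (identifier, colon,
- quoted text); the function name is made up, and other .cmt files may well use prefixes or layouts that this
- pattern does not cover.
-
```python
import re

# One entry per line: NN_mnemonic: "comment text"
ENTRY = re.compile(r'^\s*(NN_\w+)\s*:\s*"(.*)"\s*$')

def parse_comments(text):
    """Map instruction identifiers (NN_xxx) to their comment strings."""
    table = {}
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:                      # skips blank lines and // comment lines
            table[m.group(1)] = m.group(2)
    return table

excerpt = '''
// MMX instructions
NN_emms:                "Empty MMX state"
NN_movd:                "Move 32 bits"
NN_packsswb:            "Pack with Signed Saturation (Word->Byte)"
'''
print(parse_comments(excerpt)["NN_movd"])   # Move 32 bits
```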
-
- 2.IDC Scripts
- I have used IDC scripts for a number of monotonous tasks. Basically, you can use an IDC script to
- parse VCL resources, to parse VB forms (if you take the time...), to encrypt or decrypt sections
- of code, to print out a call trace, to perform searches for the user (e.g. a front-end to the RegEx
- feature), etc.
-
- Here are a quick few additional IDC scripts to demonstrate their usefulness:
- ====================================================================
- //copy.idc: Outputs selected text to an .asm file
- //Usage: Select text with mouse or cursor, hit F2 and type copy.idc, enter a filename when prompted
- // and the selected text will be written to that file.
- //Future Plans: Make this output to the Windows clipboard. I may have to patch IDA for this....
- //
- // code by mammon_ All rights reversed, use as you see fit.....
- //------------------------------------------------------------------------------------------------------
- #include <idc.idc>
-
- static main(){
- auto filename, start_loc, end_loc;
- start_loc = SelStart();
- end_loc = SelEnd();
- filename = AskFile( "asm", "Output file name?");
- WriteTxt( filename, start_loc, end_loc);
- return 0;
- }
- ====================================================================
- //------------------------------------------------------------------------------------------------------
- //Header.idc : Imports #defines from a .h file, adds as enums
- //Note: This script prompts the user for a header file (*.h), then parses the
- // file looking for #define statements: these are then converted to members
- // of enum "Defines".
- //Bugs: Only the first instance of any value will be preserved; all others will be
- // discarded with an error as you can have only one instance of any value (or
- // any name) in a single enumeration. A prompt has been added for the user to
- // name the enumerations for the header file, so that any duplicate enum values
- // can be added to a different file and enumerated under a different "enum name."
- //
- // code by mammon_ All rights reversed, use as you see fit.....
- //------------------------------------------------------------------------------------------------------
-
- #include <idc.idc>
-
- static strip_spaces( BytePtr, hHeaderFile){
- auto tempc;
- fseek( hHeaderFile, BytePtr, 0);
- tempc = fgetc(hHeaderFile);
- while ( tempc == 0x20) {
- BytePtr = BytePtr + 1;
- fseek( hHeaderFile, BytePtr, 0);
- tempc = fgetc(hHeaderFile);
- }
- return BytePtr;
- }
-
- static FindStringEnd( StrName ){
- auto x, tempc;
- for ( x = 1; x < strlen(StrName); x = x + 1) {
- tempc = substr( StrName, x-1, x);
- if ( tempc == " ") {
- return substr( StrName, 0, x);
- }
- }
- return substr( StrName, 0, strlen(StrName));
- }
-
- static FixString( StrName ){
- auto x, tempc, newname;
- newname="def"; //set newname to type character
- for ( x = 1; x < strlen(StrName); x = x + 1) {
- tempc = substr( StrName, x-1, x);
- if ( tempc != "_") {
- newname = newname + tempc;
- }
- }
- return newname;
- }
-
- static main(){
- auto HeaderFile, hHeaderFile, fLength, BytePtr, first_str, second_str, third_str, define_val;
- auto enum_id, tempc1, x, y, errcode, define_name, FilePtr, define_str, enum_name;
- FilePtr = 0;
- Message("\nStart Conversion\n");
- HeaderFile = AskFile( "*.h", "Choose a header file to parse:");
- enum_name = AskStr("Defines", "Enter a name for the enumerations (alpha only, eg 'VMMDefines'):");
- hHeaderFile = fopen( HeaderFile, "r");
- fLength = filelength(hHeaderFile);
- if( fLength == -1) Message( "Bad File Length!\n");
- enum_id = AddEnum( GetEnumQty() + 1, enum_name, FF_0NUMH);
- if ( enum_id == -1) {
- enum_id = GetEnum( enum_name );
- if(enum_id == -1) Message("Enum #Defines not created/not found\n");
- }
- SetEnumCmt( enum_id, "#define from " + HeaderFile, 1);
- while(FilePtr < fLength ){
- FilePtr = strip_spaces( FilePtr, hHeaderFile );
- BytePtr = FilePtr;
- errcode = fseek( hHeaderFile, BytePtr, 0 );
- if ( errcode != 0) break;
- first_str = readstr( hHeaderFile );
- if ( first_str == -1 ) {
- Message( "End of file! \n" );
- break;
- }
- else if ( substr(first_str, 0, 7) == "#define" || substr( first_str, 0, 7) == "#DEFINE" ) {
- FilePtr = FilePtr + strlen( first_str );
- BytePtr = BytePtr + 7;
- BytePtr = strip_spaces( BytePtr, hHeaderFile );
- errcode = fseek( hHeaderFile, BytePtr, 0 );
- if ( errcode != 0 ) break;
- second_str = readstr( hHeaderFile );
- if ( second_str == -1 ) {
- Message( "End of file after #define!\n" );
- break;
- }
- else {
- define_name = FindStringEnd( second_str );
- define_name = FixString( define_name );
- BytePtr = strip_spaces( BytePtr + strstr( second_str, " " ), hHeaderFile );
- errcode = fseek( hHeaderFile, BytePtr, 0);
- if ( errcode != 0 ) break;
- third_str = readstr( hHeaderFile);
- tempc1 = substr(third_str, 0, 2);
- if ( third_str == -1) {
- Message( "End of file before value!\n");
- break;
- }
- else if ( tempc1 == "0x" || tempc1 == "0X") {
- define_str = FindStringEnd( third_str );
- define_val = xtol( define_str );
- errcode = AddConst( enum_id, define_name, define_val);
- if ( errcode == 1 ) Message( "Name " + define_name + " bad or already used in program!\n");
- if ( errcode == 2 ) Message( "Value " + define_str + " already used in program!\n");
- if ( errcode == 3 ) Message( "Bad enumID!\n");
- }
- }
- }
- else FilePtr = FilePtr + strlen( first_str);
- }
- Message("\nConversion finished!\n");
- }
- ====================================================================
- //------------------------------------------------------------------------------------------------------
- //funcalls.idc : Display the calls made by a function
-
- #include <idc.idc>
-
- static main(){
- auto ea,x,f_end;
- ea = ChooseFunction("Select a function to parse:");
- f_end = FindFuncEnd(ea);
- Message("\n*** Code References from " + GetFunctionName(ea) + " : " + atoa(ea) + "\n");
- for ( ea ; ea <= f_end; ea = NextAddr(ea) ) {
- x = Rfirst0(ea);
- if ( x != BADADDR) {
- Message(atoa(ea) + " refers to " + Name(x) + " : " + atoa(x) + "\n");
- x = Rnext0(ea,x);
- }
- while ( x != BADADDR) {
- Message(atoa(ea) + " refers to " + Name(x) + " : " + atoa(x) + "\n");
- x = Rnext0(ea,x);
- }
- }
- Message("End of output. \n");
- }
- ===================================================================
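-
- The parsing that the header script above performs, and the duplicate-value limitation listed under its "Bugs",
- can be sketched outside of IDA in a few lines. This Python version is only an approximation under my own
- assumptions: it accepts hex values only (as the IDC script does), it does not rename the symbols the way
- FixString() does, and the VMM_ names in the sample header are invented for the example.
-
```python
import re

# "#define NAME 0xVALUE" with a hexadecimal value
DEFINE = re.compile(r"^\s*#define\s+(\w+)\s+(0[xX][0-9A-Fa-f]+)\b")

def parse_defines(text):
    """Return (members, skipped): hex #defines keyed by value, plus duplicates."""
    members, skipped = {}, []
    for line in text.splitlines():
        m = DEFINE.match(line)
        if not m:
            continue                   # not a hex #define
        name, value = m.group(1), int(m.group(2), 16)
        if value in members:
            skipped.append(name)       # value already taken within this enum
        else:
            members[value] = name
    return members, skipped

header = """
#define VMM_OK      0x0001
#define VMM_FAIL    0x0002
#define VMM_ALIAS   0x0001
#define VERSION     "1.0"
"""
members, skipped = parse_defines(header)
print(members, skipped)
```
-
- The names that end up in "skipped" are the ones you would move to a second header file and import under a
- different enum name, as described in the script's comments.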
- And, finally, I have referred to a reslib.idc file throughout this work. It can be found at
- http://www.eccentrica.org/Mammon/Reslib.idc
- with its "caller file" at
- http://www.eccentrica.org/Mammon/Res.idc
-
- 3.Map files
- Map files may be generated by IDA using the File->Produce Output File->Produce Map File
- menu item. All of the user-created and auto-generated names (if selected) will be included
- as symbols in the .MAP files, which then can be converted into Soft-Ice symbol files using
- NMSYM.EXE.
-
- Note that there are a few tricks to this; I recommend using Gij's MaptoMap utility for the conversion.
-
- 4.ASM files
- The ASM files may be used to produce compilable source code. This is not, strictly speaking, the
- province of the cracker, but a bit of good practice can be found by taking various small .COM files
- (such as debug, edit, or the various Crack-me's) and re-compiling them.